Overview

Dataset statistics

Number of variables: 15
Number of observations: 15000
Missing cells: 2573
Missing cells (%): 1.1%
Total size in memory: 6.7 MiB
Average record size in memory: 471.2 B
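
The missing-cell percentage is taken over all cells (variables × observations), not over rows, which is why 2573 missing cells show up as 1.1% here but as 17.2% for the single affected column. A minimal sketch of that arithmetic, using only the figures reported above (the variable names are illustrative):

```python
# Figures copied from the dataset statistics above.
n_variables = 15
n_observations = 15000
missing_cells = 2573

# Dataset-level share: denominator is every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Column-level share: all missing cells sit in one column (tempo_emprego),
# so the denominator is the number of rows.
column_missing_pct = 100 * missing_cells / n_observations

print(f"{missing_pct:.1f}%")         # 1.1%
print(f"{column_missing_pct:.1f}%")  # 17.2%
```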

Variable types

Numeric: 7
Categorical: 6
Boolean: 2

Alerts

tempo_emprego has 2573 (17.2%) missing values (Missing)
Unnamed: 0 has unique values (Unique)
qtd_filhos has 10376 (69.2%) zeros (Zeros)

Reproduction

Analysis started: 2021-10-01 13:29:29.478915
Analysis finished: 2021-10-01 13:29:30.156325
Duration: 0.68 seconds
Software version: pandas-profiling v3.1.0
Download configuration: config.json

Variables

Unnamed: 0
Real number (ℝ≥0)

UNIQUE

Distinct: 15000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7499.5
Minimum: 0
Maximum: 14999
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 117.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 749.95
Q1: 3749.75
median: 7499.5
Q3: 11249.25
95-th percentile: 14249.05
Maximum: 14999
Range: 14999
Interquartile range (IQR): 7499.5

Descriptive statistics

Standard deviation: 4330.271354
Coefficient of variation (CV): 0.5774080077
Kurtosis: -1.2
Mean: 7499.5
Median Absolute Deviation (MAD): 3750
Skewness: 0
Sum: 112492500
Variance: 18751250
Monotonicity: Strictly increasing
Histogram with fixed size bins (bins=50)
Value                  Count   Frequency (%)
2047                   1       < 0.1%
8801                   1       < 0.1%
629                    1       < 0.1%
2676                   1       < 0.1%
12915                  1       < 0.1%
14962                  1       < 0.1%
8817                   1       < 0.1%
10864                  1       < 0.1%
4711                   1       < 0.1%
6758                   1       < 0.1%
Other values (14990)   14990   99.9%

Value   Count   Frequency (%)
0       1       < 0.1%
1       1       < 0.1%
2       1       < 0.1%
3       1       < 0.1%
4       1       < 0.1%
5       1       < 0.1%
6       1       < 0.1%
7       1       < 0.1%
8       1       < 0.1%
9       1       < 0.1%

Value   Count   Frequency (%)
14999   1       < 0.1%
14998   1       < 0.1%
14997   1       < 0.1%
14996   1       < 0.1%
14995   1       < 0.1%
14994   1       < 0.1%
14993   1       < 0.1%
14992   1       < 0.1%
14991   1       < 0.1%
14990   1       < 0.1%
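
The figures for Unnamed: 0 are what one would expect from a plain row index running from 0 to 14999. The sketch below recomputes a few of them with the standard library. Two assumptions are baked in and not confirmed by the report itself: the variance is the sample variance (divisor n - 1), and percentiles use linear interpolation between neighbouring values. The `quantile` helper is hypothetical and written only for this illustration.

```python
import statistics

# A strictly increasing index with one value per row.
values = list(range(15000))


def quantile(sorted_values, q):
    """Quantile by linear interpolation (assumed method, see note above)."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] + frac * (sorted_values[hi] - sorted_values[lo])


mean = statistics.mean(values)          # 7499.5
variance = statistics.variance(values)  # sample variance: 18751250
std = statistics.stdev(values)          # about 4330.2714
p5 = quantile(values, 0.05)             # about 749.95
q1 = quantile(values, 0.25)             # 3749.75
q3 = quantile(values, 0.75)             # 11249.25
iqr = q3 - q1                           # 7499.5

print(mean, variance, round(std, 6), round(p5, 2), q1, q3, iqr)
```

Under those assumptions the recomputed values agree with the reported ones, which supports reading this column as a saved index rather than a feature.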

data_ref
Categorical

Distinct: 15
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 981.6 KiB

Value               Count
2016-02-01          1000
2015-05-01          1000
2015-08-01          1000
2015-11-01          1000
2015-06-01          1000
Other values (10)   10000

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2015-01-01
2nd row: 2015-01-01
3rd row: 2015-01-01
4th row: 2015-01-01
5th row: 2015-01-01

Common Values

Value              Count   Frequency (%)
2016-02-01         1000    6.7%
2015-05-01         1000    6.7%
2015-08-01         1000    6.7%
2015-11-01         1000    6.7%
2015-06-01         1000    6.7%
2015-12-01         1000    6.7%
2015-04-01         1000    6.7%
2015-10-01         1000    6.7%
2015-07-01         1000    6.7%
2015-01-01         1000    6.7%
Other values (5)   5000    33.3%

Value              Count   Frequency (%)
2015-03-01         1000    6.7%
2015-09-01         1000    6.7%
2016-01-01         1000    6.7%
2015-02-01         1000    6.7%
2016-03-01         1000    6.7%
2015-01-01         1000    6.7%
2015-07-01         1000    6.7%
2015-10-01         1000    6.7%
2015-04-01         1000    6.7%
2015-12-01         1000    6.7%
Other values (5)   5000    33.3%

Most occurring characters, categories, scripts, and blocks

No values found.

id_cliente
Real number (ℝ≥0)

Distinct: 9845
Distinct (%): 65.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 8304.8714
Minimum: 1
Maximum: 16649
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 117.3 KiB

Quantile statistics

Minimum: 1
5-th percentile: 843.9
Q1: 4181
median: 8297
Q3: 12403
95-th percentile: 15846.15
Maximum: 16649
Range: 16648
Interquartile range (IQR): 8222

Descriptive statistics

Standard deviation: 4797.780446
Coefficient of variation (CV): 0.5777067717
Kurtosis: -1.192310779
Mean: 8304.8714
Median Absolute Deviation (MAD): 4110.5
Skewness: 0.007218453187
Sum: 124573071
Variance: 23018697.21
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
6356                  6       < 0.1%
9948                  6       < 0.1%
5573                  6       < 0.1%
9351                  6       < 0.1%
8635                  6       < 0.1%
6062                  6       < 0.1%
7597                  5       < 0.1%
5494                  5       < 0.1%
12106                 5       < 0.1%
6486                  5       < 0.1%
Other values (9835)   14944   99.6%

Value   Count   Frequency (%)
1       2       < 0.1%
2       2       < 0.1%
3       1       < 0.1%
4       1       < 0.1%
5       1       < 0.1%
7       1       < 0.1%
8       3       < 0.1%
9       3       < 0.1%
10      1       < 0.1%
11      3       < 0.1%

Value   Count   Frequency (%)
16649   1       < 0.1%
16648   1       < 0.1%
16647   2       < 0.1%
16638   1       < 0.1%
16637   1       < 0.1%
16635   1       < 0.1%
16634   1       < 0.1%
16633   3       < 0.1%
16632   1       < 0.1%
16631   1       < 0.1%

sexo
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 849.7 KiB

Value   Count
F       10119
M       4881

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: F
2nd row: M
3rd row: F
4th row: F
5th row: M

Common Values

Value   Count   Frequency (%)
F       10119   67.5%
M       4881    32.5%

Value   Count   Frequency (%)
f       10119   67.5%
m       4881    32.5%

Most occurring characters, categories, scripts, and blocks

No values found.

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 14.8 KiB

Value   Count   Frequency (%)
False   9140    60.9%
True    5860    39.1%

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 14.8 KiB

Value   Count   Frequency (%)
True    10143   67.6%
False   4857    32.4%

qtd_filhos
Real number (ℝ≥0)

ZEROS

Distinct: 8
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.4323333333
Minimum: 0
Maximum: 14
Zeros: 10376
Zeros (%): 69.2%
Negative: 0
Negative (%): 0.0%
Memory size: 117.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 1
95-th percentile: 2
Maximum: 14
Range: 14
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.7466313589
Coefficient of variation (CV): 1.726980784
Kurtosis: 17.97486657
Mean: 0.4323333333
Median Absolute Deviation (MAD): 0
Skewness: 2.4870666
Sum: 6485
Variance: 0.5574583861
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=8)
Value   Count   Frequency (%)
0       10376   69.2%
1       3037    20.2%
2       1376    9.2%
3       185     1.2%
4       17      0.1%
7       5       < 0.1%
14      2       < 0.1%
5       2       < 0.1%

Value   Count   Frequency (%)
0       10376   69.2%
1       3037    20.2%
2       1376    9.2%
3       185     1.2%
4       17      0.1%
5       2       < 0.1%
7       5       < 0.1%
14      2       < 0.1%

Value   Count   Frequency (%)
14      2       < 0.1%
7       5       < 0.1%
5       2       < 0.1%
4       17      0.1%
3       185     1.2%
2       1376    9.2%
1       3037    20.2%
0       10376   69.2%
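
Because qtd_filhos takes only eight distinct values, its summary statistics can be rebuilt from the frequency table alone, which is a quick way to check that the table and the statistics agree. A minimal sketch, assuming the reported variance is the sample variance (divisor n - 1); the dictionary simply restates the counts listed above:

```python
# value -> count, copied from the qtd_filhos frequency table.
counts = {0: 10376, 1: 3037, 2: 1376, 3: 185, 4: 17, 5: 2, 7: 5, 14: 2}

n = sum(counts.values())                               # number of observations
total = sum(value * count for value, count in counts.items())
mean = total / n

# Weighted squared deviations, divided by n - 1 (sample variance).
squared_dev = sum(count * (value - mean) ** 2 for value, count in counts.items())
variance = squared_dev / (n - 1)

zeros_pct = 100 * counts[0] / n

print(n, total)             # 15000 6485
print(round(mean, 6))       # 0.432333
print(round(variance, 6))   # 0.557458
print(round(zeros_pct, 1))  # 69.2
```

The recomputed sum, mean, variance, and share of zeros match the reported figures, so the Zeros alert reflects the data (most records have no children) rather than a parsing problem.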

tipo_renda
Categorical

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 MiB

Value              Count
Assalariado        7633
Empresário         3508
Pensionista        2582
Servidor público   1268
Bolsista           9

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Empresário
2nd row: Assalariado
3rd row: Empresário
4th row: Servidor público
5th row: Assalariado

Common Values

Value              Count   Frequency (%)
Assalariado        7633    50.9%
Empresário         3508    23.4%
Pensionista        2582    17.2%
Servidor público   1268    8.5%
Bolsista           9       0.1%

Value         Count   Frequency (%)
assalariado   7633    46.9%
empresário    3508    21.6%
pensionista   2582    15.9%
público       1268    7.8%
servidor      1268    7.8%
bolsista      9       0.1%
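
The second table lists lower-cased single words with percentages that differ from the category shares (46.9% instead of 50.9% for Assalariado). The numbers are consistent with a word-level count in which "Servidor público" contributes two words, so the denominator is the total number of words rather than the number of rows. This is an inference from the figures, not something the report states. A minimal sketch of that reading:

```python
from collections import Counter

# Category counts copied from the Common Values table above.
category_counts = {
    "Assalariado": 7633,
    "Empresário": 3508,
    "Pensionista": 2582,
    "Servidor público": 1268,
    "Bolsista": 9,
}

# Split each label into lower-cased words and weight by the category count.
word_counts = Counter()
for label, count in category_counts.items():
    for word in label.lower().split():
        word_counts[word] += count

total_words = sum(word_counts.values())  # 15000 rows + 1268 extra words
shares = {word: round(100 * count / total_words, 1)
          for word, count in word_counts.items()}

print(total_words)            # 16268
print(shares["assalariado"])  # 46.9
print(shares["servidor"])     # 7.8
```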

Most occurring characters, categories, scripts, and blocks

No values found.

educacao
Categorical

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.2 MiB

Value                 Count
Secundário            8895
Superior completo     5335
Superior incompleto   579
Primário              165
Pós graduação         26

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Secundário
2nd row: Superior completo
3rd row: Superior completo
4th row: Superior completo
5th row: Secundário

Common Values

Value                 Count   Frequency (%)
Secundário            8895    59.3%
Superior completo     5335    35.6%
Superior incompleto   579     3.9%
Primário              165     1.1%
Pós graduação         26      0.2%

Value        Count   Frequency (%)
secundário   8895    42.5%
superior     5914    28.2%
completo     5335    25.5%
incompleto   579     2.8%
primário     165     0.8%
graduação    26      0.1%
pós          26      0.1%

Most occurring characters, categories, scripts, and blocks

No values found.

estado_civil
Categorical

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 968.4 KiB

Value      Count
Casado     10534
Solteiro   1798
União      1078
Separado   879
Viúvo      711

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Solteiro
2nd row: Casado
3rd row: Casado
4th row: Casado
5th row: Solteiro

Common Values

Value      Count   Frequency (%)
Casado     10534   70.2%
Solteiro   1798    12.0%
União      1078    7.2%
Separado   879     5.9%
Viúvo      711     4.7%

Value      Count   Frequency (%)
casado     10534   70.2%
solteiro   1798    12.0%
união      1078    7.2%
separado   879     5.9%
viúvo      711     4.7%

Most occurring characters, categories, scripts, and blocks

No values found.

tipo_residencia
Categorical

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 907.0 KiB

Value           Count
Casa            13532
Com os pais     675
Governamental   452
Aluguel         194
Estúdio         83

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Casa
2nd row: Casa
3rd row: Casa
4th row: Casa
5th row: Governamental

Common Values

Value           Count   Frequency (%)
Casa            13532   90.2%
Com os pais     675     4.5%
Governamental   452     3.0%
Aluguel         194     1.3%
Estúdio         83      0.6%
Comunitário     64      0.4%

Value           Count   Frequency (%)
casa            13532   82.8%
pais            675     4.1%
os              675     4.1%
com             675     4.1%
governamental   452     2.8%
aluguel         194     1.2%
estúdio         83      0.5%
comunitário     64      0.4%

Most occurring characters, categories, scripts, and blocks

No values found.

idade
Real number (ℝ≥0)

Distinct: 47
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 43.88233333
Minimum: 22
Maximum: 68
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 117.3 KiB

Quantile statistics

Minimum: 22
5-th percentile: 27
Q1: 34
median: 43
Q3: 53
95-th percentile: 63
Maximum: 68
Range: 46
Interquartile range (IQR): 19

Descriptive statistics

Standard deviation: 11.27315514
Coefficient of variation (CV): 0.2568950711
Kurtosis: -1.044419826
Mean: 43.88233333
Median Absolute Deviation (MAD): 9
Skewness: 0.1726836216
Sum: 658235
Variance: 127.0840268
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=47)
Value               Count   Frequency (%)
40                  538     3.6%
37                  469     3.1%
43                  458     3.1%
32                  455     3.0%
33                  441     2.9%
39                  440     2.9%
38                  438     2.9%
27                  436     2.9%
60                  435     2.9%
46                  431     2.9%
Other values (37)   10459   69.7%

Value   Count   Frequency (%)
22      15      0.1%
23      26      0.2%
24      99      0.7%
25      133     0.9%
26      177     1.2%
27      436     2.9%
28      410     2.7%
29      377     2.5%
30      429     2.9%
31      403     2.7%

Value   Count   Frequency (%)
68      13      0.1%
67      66      0.4%
66      127     0.8%
65      143     1.0%
64      201     1.3%
63      238     1.6%
62      241     1.6%
61      253     1.7%
60      435     2.9%
59      308     2.1%

tempo_emprego
Real number (ℝ≥0)

MISSING

Distinct: 2589
Distinct (%): 20.8%
Missing: 2573
Missing (%): 17.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.722634652
Minimum: 0.1178082192
Maximum: 42.90684932
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 117.3 KiB

Quantile statistics

Minimum: 0.1178082192
5-th percentile: 0.721369863
Q1: 2.973972603
median: 6.01369863
Q3: 10.12054795
95-th percentile: 21.43561644
Maximum: 42.90684932
Range: 42.7890411
Interquartile range (IQR): 7.146575342

Descriptive statistics

Standard deviation: 6.711188751
Coefficient of variation (CV): 0.869028389
Kurtosis: 3.522176833
Mean: 7.722634652
Median Absolute Deviation (MAD): 3.44109589
Skewness: 1.686996166
Sum: 95969.18082
Variance: 45.04005445
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
4.216438356           38      0.3%
6.934246575           30      0.2%
7.520547945           29      0.2%
15.44931507           27      0.2%
4.517808219           26      0.2%
5.717808219           26      0.2%
1.098630137           26      0.2%
4.597260274           25      0.2%
1.432876712           24      0.2%
2.682191781           24      0.2%
Other values (2579)   12152   81.0%
(Missing)             2573    17.2%

Value          Count   Frequency (%)
0.1178082192   2       < 0.1%
0.1780821918   3       < 0.1%
0.2            4       < 0.1%
0.2164383562   1       < 0.1%
0.2410958904   1       < 0.1%
0.2438356164   2       < 0.1%
0.2493150685   3       < 0.1%
0.2520547945   1       < 0.1%
0.2547945205   10      0.1%
0.2602739726   7       < 0.1%

Value         Count   Frequency (%)
42.90684932   3       < 0.1%
41.2          17      0.1%
40.78630137   2       < 0.1%
40.57534247   7       < 0.1%
39.82465753   4       < 0.1%
39.65205479   8       0.1%
39.48767123   1       < 0.1%
39.28219178   1       < 0.1%
38.40547945   1       < 0.1%
36.86575342   4       < 0.1%
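
tempo_emprego is the only variable with missing values, and its percentages do not all share a denominator. The reported figures are consistent with Missing (%) being taken over all rows, while Distinct (%) and the mean are taken over the non-missing values only. A minimal sketch of that check, using only numbers from the tables above (variable names are illustrative):

```python
# Figures copied from the tempo_emprego section.
n_rows = 15000
n_missing = 2573
n_distinct = 2589
value_sum = 95969.18082

n_present = n_rows - n_missing              # rows with a recorded value

missing_pct = 100 * n_missing / n_rows      # share of all rows
distinct_pct = 100 * n_distinct / n_present # share of non-missing rows
mean = value_sum / n_present                # mean ignores missing rows

print(n_present)               # 12427
print(round(missing_pct, 1))   # 17.2
print(round(distinct_pct, 1))  # 20.8
print(round(mean, 4))          # 7.7226
```

Any imputation or row filtering applied to this column would therefore change the mean and the distinct share, but not the counts in the other columns.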

qt_pessoas_residencia
Real number (ℝ≥0)

Distinct: 9
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.2064
Minimum: 1
Maximum: 15
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 117.3 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
median: 2
Q3: 3
95-th percentile: 4
Maximum: 15
Range: 14
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.9097916729
Coefficient of variation (CV): 0.4123421288
Kurtosis: 6.716274427
Mean: 2.2064
Median Absolute Deviation (MAD): 0
Skewness: 1.30045517
Sum: 33096
Variance: 0.8277208881
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=9)
Value   Count   Frequency (%)
2       8181    54.5%
1       2752    18.3%
3       2551    17.0%
4       1311    8.7%
5       179     1.2%
6       18      0.1%
9       5       < 0.1%
15      2       < 0.1%
7       1       < 0.1%

Value   Count   Frequency (%)
1       2752    18.3%
2       8181    54.5%
3       2551    17.0%
4       1311    8.7%
5       179     1.2%
6       18      0.1%
7       1       < 0.1%
9       5       < 0.1%
15      2       < 0.1%

Value   Count   Frequency (%)
15      2       < 0.1%
9       5       < 0.1%
7       1       < 0.1%
6       18      0.1%
5       179     1.2%
4       1311    8.7%
3       2551    17.0%
2       8181    54.5%
1       2752    18.3%

renda
Real number (ℝ≥0)

Distinct: 9786
Distinct (%): 65.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5697.287057
Minimum: 118.71
Maximum: 245141.67
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 117.3 KiB

Quantile statistics

Minimum: 118.71
5-th percentile: 939.11
Q1: 2026.11
median: 3499.72
Q3: 6392.1675
95-th percentile: 16885.964
Maximum: 245141.67
Range: 245022.96
Interquartile range (IQR): 4366.0575

Descriptive statistics

Standard deviation: 8266.816289
Coefficient of variation (CV): 1.451009262
Kurtosis: 131.8811994
Mean: 5697.287057
Median Absolute Deviation (MAD): 1810.345
Skewness: 8.385628482
Sum: 85459305.85
Variance: 68340251.56
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
2999.59               6       < 0.1%
9826.31               6       < 0.1%
1272.04               6       < 0.1%
4234.94               6       < 0.1%
43433.94              6       < 0.1%
5402.44               6       < 0.1%
728.96                6       < 0.1%
2610.77               5       < 0.1%
2559.18               5       < 0.1%
11365.88              5       < 0.1%
Other values (9776)   14943   99.6%

Value    Count   Frequency (%)
118.71   1       < 0.1%
211.04   1       < 0.1%
222.87   2       < 0.1%
238.2    1       < 0.1%
249.14   2       < 0.1%
269.09   1       < 0.1%
275.11   1       < 0.1%
300.76   2       < 0.1%
307.48   2       < 0.1%
316.7    2       < 0.1%

Value       Count   Frequency (%)
245141.67   1       < 0.1%
179538.8    1       < 0.1%
172748.39   1       < 0.1%
166223.85   2       < 0.1%
154006.23   1       < 0.1%
140482.94   1       < 0.1%
121348.3    2       < 0.1%
119626.38   1       < 0.1%
107414.21   2       < 0.1%
102641.07   1       < 0.1%
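
Several of the renda figures are derived from others in the same section: the range from the extremes, the IQR from the quartiles, and the coefficient of variation and variance from the mean and standard deviation. The sketch below recomputes them from the reported inputs as a consistency check. It uses only standard definitions; the variable names are illustrative.

```python
# Inputs copied from the renda statistics above.
minimum, maximum = 118.71, 245141.67
q1, median, q3 = 2026.11, 3499.72, 6392.1675
mean, std = 5697.287057, 8266.816289

value_range = maximum - minimum  # reported: 245022.96
iqr = q3 - q1                    # reported: 4366.0575
cv = std / mean                  # reported: 1.451009262
variance = std ** 2              # reported: 68340251.56

# The mean sits well above the median, in line with the strong
# positive skewness (8.39) reported for this variable.
mean_to_median = mean / median

print(round(value_range, 2), round(iqr, 4), round(cv, 6))
print(round(mean_to_median, 2))
```

With a standard deviation larger than the mean and a maximum far above the 95-th percentile, the median and IQR are likely the more representative summaries of typical income in this dataset.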